(54) Title: METHODS AND SYSTEMS FOR AUTOMATED DATA PROCESSING

(57) Abstract: Embodiments of the present invention are directed to methods and systems for processing and/or validating data.
A method may include arranging a plurality of nodes in a graph, where
each node represents at least one processing step for processing data by a processor and wherein at least one of the plurality of
nodes comprises at least one data retrieval node for retrieving data for validation. The method may also include establishing at least
one output from substantially all of the plurality of nodes, except for the at least one data retrieval node, establishing at least one
input to each of the plurality of nodes, configuring one or more parameters of each node, and linking at least one output of each
of substantially all of the plurality of nodes to an input of another node, where each link represents a data flow. The method
may further include sequencing a dependency among the plurality of nodes and establishing processing logic in at least one node to
process data in a predetermined manner.
METHODS AND SYSTEMS
FOR AUTOMATED DATA PROCESSING 

Field of the Invention 

Embodiments of the present invention are related to methods and systems for 
processing and/or validating data, and more particularly, to methods and systems for
validating data for revenue assurance. 

Background of the Invention 

In many organizations, data validation, whether for revenue assurance or any other
purpose, is a difficult and error-prone task. For a wide array of reasons, business rules
and/or logic used to validate data are often so complex that their implementation is
manually intensive, resulting in tremendous inefficiencies of time and cost, as well as many
possible human errors (e.g., typos). While these issues are quite common and well known,
too many organizations continue to do revenue assurance without automated processes.

In the past, where an automated or partly automated solution has been attempted, it
has most often taken the form of scripts. SQL, shell, and other scripts comprise the vast
majority of information technology (IT) leveraged revenue assurance solutions. Yet scripts
and other obtuse programs create problems of their own, mostly stemming from the fact
that scripts are difficult to read and/or understand. Moreover, since scripts provide
virtually no means for complexity management, they often develop into tangled and
complicated programs. As a result, scripts usually can only be modified (if at all) by the
person who originally wrote them. However, even if they can be modified, every
modification carries with it the risk of breaking the entire script. Even additive changes
risk altering preexisting functionality. In addition, since typically only the programmer
understands the scripts, a subject matter expert, i.e., one who understands the
processing/validation rules to be applied, cannot easily determine whether a script is
drafted correctly. Thus, the creation of a correct script is difficult, time consuming and
costly.
For example, since business rule requirements in current data validation methods
must be documented in painstaking detail to mitigate communication risks, development
moves slowly with little regard for deadlines, and testing must be methodical and
lengthy. By the time scripts are completed, the business rules incorporated in them have
most likely changed. This lag is the fundamental failure of script-based solutions: it
makes their results inaccurate and thus diminishes their value.



SUMMARY OF THE INVENTION 

Embodiments of the invention address problems of prior art data
processing/validation techniques and present novel systems and associated processes,
which enable an iterative, collaborative process for implementing business rules and other
logic (together, "rules") to process and/or validate data. Data processing may be defined,
executed, analyzed and refined in minutes, and may be repeated until the rules are both
precise and accurate, taking hours or days instead of months. The rules themselves are
easily codified in visual flowcharts that are easy to read and understand, even for
non-technical personnel.

Moreover, embodiments of the present invention inherently provide a basic level of 
documentation with no extra effort. For example, documentation may easily be effected 
using an HTML document with a complete audit trail of the last execution of a business 
rule graph, including all nodes, connections, parameters (fields), embedded source code, 
notes, statistics, execution times and duration, excerpts of data, and the like. 

In effect, some embodiments of the invention allow a user to program a computer
using a graphical user interface to draft a visual and working flowchart for data processing
using a plurality of predefined nodes, each of which accomplishes predefined and modifiable
tasks.

In one embodiment of the present invention, a method for processing data using a 
graphical user interface of a computer system is provided and may include arranging a 
plurality of nodes in a graph, where each node represents at least one processing step for 
processing data by a processor and wherein at least one of the plurality of nodes comprise 
at least one data retrieval node for retrieving data for validation. The method may also
include establishing at least one output from substantially all of the plurality of nodes,
except for the at least one data retrieval node, establishing at least one input to each of the
plurality of nodes, configuring one or more parameters of each node, and linking at least
one output of each of substantially all of the plurality of nodes to an input of another node,
where each link represents a data flow. The method may further include sequencing a
dependency among the plurality of nodes and establishing processing logic in at least one 
node to process data in a predetermined manner. 

In another embodiment of the invention, a system for processing data using a 
graphical user interface of a computer system is provided and may include arranging means 
for arranging a plurality of nodes in a graph-space, where each node represents at least one 
processing step for processing data and wherein at least one of the plurality of nodes 
comprise at least one data retrieval node for retrieving data for validation. The system may 
also include establishing means for establishing at least one output from substantially all of 
the plurality of nodes and for establishing at least one input to each of the plurality of 
nodes, except for the at least one data retrieval node, configuring means for configuring
one or more parameters of each node, and linking means for linking at least one output of
each of substantially all of the plurality of nodes with an input of another node, where each
link represents a data flow. The system may also include sequencing means for
sequencing execution of one or more nodes and setup means for setting up processing logic 
in at least one node to process data in a predetermined manner. 

In yet another embodiment of the invention, a system for processing data using a
graphical user interface of a computer system is provided and may include an editor
including a graphical user interface, a graphical workspace for designing a processing
graph having a plurality of processing nodes, an execution file, where the execution file
results from compiling the processing graph, and a controller for directing the running of
the execution file on one or more computers.

Further embodiments may also include computer readable media having computer 
instructions for enabling a computer system to perform methods according to any of the 
embodiments of the invention. Other embodiments may include application programs for
enabling a computer system to perform the methods according to any of the embodiments
of the invention.

These and other embodiments, as well as further objects and advantages of the
present invention will become even more clear with reference to the following detailed 
description and attached figures, a brief description of which follows. 



BRIEF DESCRIPTION OF THE FIGURES

Fig. 1 illustrates a block diagram of a system for processing and/or validating data 
according to an embodiment of the invention. 

Fig. 2 illustrates a workflow for BRAIN for processing and/or validating data 
according to an embodiment of the invention. 

Fig. 3 illustrates a screenshot of a graphical-user-interface (GUI) for use with an
editor program for graphically programming a data processing and/or validation process
according to an embodiment of the invention.

Fig. 4 illustrates a representative example of a graphical program/process, having a 
plurality of interconnected nodes for accomplishing a data processing/validation process. 

Fig. 5 illustrates a timing (clock) node for sequencing nodes of a graphical program
according to an embodiment of the invention.

Fig. 6 illustrates a parameter popup window for an editor program for editing 
parameters of an example node according to an embodiment of the invention.

Fig. 7 illustrates a bundler node according to an embodiment of the present
invention.

Fig. 8 illustrates a composite node according to an embodiment of the present 
invention. 

Fig. 9 illustrates an example of a beginning stage of a development of a business 
rule graph according to an embodiment of the invention. 
Fig. 10 illustrates a parameter popup window for a type of data retrieval node
according to an embodiment of the present invention.

Fig. 11 illustrates a parameter popup window for another type of data retrieval node
according to an embodiment of the present invention.

Fig. 12 illustrates a parameter popup window for determining outputs of a node
according to an embodiment of the present invention.

Fig. 13 illustrates an example of a further stage of a development of a business rule
graph according to an embodiment of the invention. 

Fig. 14 illustrates a parameter popup window for a concatenating node according to
an embodiment of the present invention.

Fig. 15 illustrates an example of yet a further stage of a development of a business 
rule graph according to an embodiment of the invention. 



Figs. 16A-16C illustrate popup window displays of results of processed data for a
node according to an embodiment of the present invention.

Fig. 17 illustrates an example of still yet a further stage of a development of a 
business rule graph according to an embodiment of the invention. 

Fig. 18 illustrates a parameter popup window for a sorting node according to an
embodiment of the present invention. 

Fig. 19 illustrates an example of still yet a further stage of a development of a 
business rule graph according to an embodiment of the invention.

Fig. 20 illustrates a popup window display for indicating join types of a join
node according to an embodiment of the invention. 

Fig. 21 is a Venn diagram illustrating what data is sent to a particular output of a 
join node according to an embodiment of the invention. 



Fig. 22 illustrates a parameter window for indicating the scripting language for the
join.
Fig. 23 illustrates an example of still yet a further stage of a development of a
business rule graph according to an embodiment of the invention. 

Fig. 24 illustrates an example of still yet a further stage of a development of a 
business rule graph according to an embodiment of the invention. 

Fig. 25 illustrates a parameter popup window for an aggregating node according to
an embodiment of the present invention. 

Fig. 26 illustrates an example of a completed initial development of a business rule 
graph according to an embodiment of the invention. 

Fig. 27 illustrates a parameter popup window for a database loading node according 
to an embodiment of the present invention. 

DETAILED DESCRIPTION OF THE EMBODIMENTS 

Embodiments of the present invention may be embodied in hardware (e.g., ASIC, 
processors and/or other integrated circuits), or software, or both. For illustrative purposes 
only, the embodiments of the invention will be described as being embodied in software 
operating on one or more computer systems, and preferably, operated over a computer 
network. Such a network may include one or more server computers and one or more 
workstation computers (a workstation may also operate as a server). 

In the detailed description which follows, embodiments of the invention will 
sometimes be described with reference to processing and/or validating data with respect to 
a telecommunications system. Such descriptions are meant as an example only and are not 
intended to limit the scope of the invention. 



BRAIN 

Embodiments of the present invention include a Business Rule Automation
Infrastructure (BRAIN) which combines powerful complexity management for processing
data with an ability to use multiple processors (e.g., one or more) from a plurality of server
computers (servers) in a scalable format. Embodiments of BRAIN may include one or
more of the following components: a business rule editor (BRE), a business rule graph
(BRG), a business rule executable (BRX), a controller and a server farm operating one or
more drones (a drone being a process for executing a task).

BRAIN may be operated as part of a total system for processing and/or validating
data. Such a system is illustrated in Fig. 1. An example of such a system may be a revenue
assurance system as disclosed in related pending U.S. patent application no. 10/356,254,
filed January 31, 2003 (publication no. 20040153382), the entire disclosure of which is
incorporated by reference in the present application.

As shown in Fig. 1, BRAIN receives source data from a data warehouse. Such
data, for a telecommunications system, may include operational support system data (OSS),
business support system data (BSS) and reference data (for example). Using a workstation,
an end user (user) can use BRAIN to process and/or validate the source data to generate
discrepancies and statistics, which may be stored in a database (e.g., "Data Storage"). The
discrepancies may be researched and resolved by a user using the same or another
workstation. In addition, a user can generate reports of the discrepancies and statistics
(Revenue Assurance Management). It is understood that all interaction with BRAIN
and/or the entire system illustrated in Fig. 1 may be accomplished using a single
workstation. Fig. 1 merely illustrates one particular manner in which the system may be
arranged for multiple users and/or locations using a networked environment and multiple
workstations.

Fig. 2 illustrates a workflow for BRAIN for processing data. As shown, a BRG is
created by the BRE. The BRE is an editor application program, operational for at least one
or more of creating, editing, refining, compiling, executing, testing and debugging of a
BRG. A screenshot of the GUI according to some embodiments of the invention is
shown in Fig. 3. Primitives area 310 includes a plurality of objects (e.g., nodes) from a
library that may be selected and used for/in a palette for a BRG, for performing modifiable,
predefined tasks.



A BRG is a visual flowchart which may be used to arrange a plurality of nodes,
each of which may be color coded (either via user preference or automatically by the BRE)
and each of which may represent one or more processing steps/tasks to be performed for
processing and/or validating data. Results from one node may be forwarded to another
node for further processing or storage in a file or database. Fig. 4 illustrates a
representative example of a BRG illustrating a plurality of interconnected nodes. BRGs
may be created to accomplish, for example, generic or particular tasks, and moreover, such
BRGs may be used as templates for other BRGs for similar tasks.

A completed BRG (for example) may be compiled (e.g., using the BRE or other
compiling application) to form a BRX, an executable file which may be then executed by 
the controller using the server farm. Each computer of the server farm may be used to 
execute the one or more particular tasks of the nodes using, for example, drones. 
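
The following is a minimal sketch of one way a compiled graph could order its nodes so that each node runs after the nodes feeding it. The node names are hypothetical, and the use of Python's standard `graphlib` module is an assumption made for illustration; the text does not describe the internal format of a BRX or how the controller schedules drones.

```python
from graphlib import TopologicalSorter

# Hypothetical BRG: each key is a node, each value is the set of
# upstream nodes whose outputs are linked to that node's inputs.
links = {
    "infile_a": set(),
    "infile_b": set(),
    "sort_a": {"infile_a"},
    "sort_b": {"infile_b"},
    "join": {"sort_a", "sort_b"},
    "dbloader": {"join"},
}


def execution_order(graph):
    """Return node names so that every node appears after its inputs."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(links)
```

In this sketch the two data retrieval nodes have no inputs, so they may be dispatched first and, in principle, to different servers of the farm.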



Nodes

Nodes are used in the present invention to perform a wide variety of tasks and each
preferably includes user definable parameters/fields. The definable parameters allow a
node to be easily modified so that it may be able to perform a particular desired task.
Moreover, a user may also define additional parameters for a node for additional
customization. Tasks that may be performed by nodes include (for example): filtering,
sorting, cross-referencing, aggregating, separating, reading, writing, and the like.

In general, each node may include one or more inputs and one or more outputs, 
depending upon the type of node (i.e., the task that the node performs), and in some cases, 
nodes may not include an input or an output (or both).

Each node may be configured to perform one or more predefined tasks preferably
using a general purpose scripting language. Such a programming language preferably
includes simple grammar and syntax similar to that of, for example, Lisp or Scheme. The
semantics for a preferred language may include a collection of low-level functions and/or
built-in operators. Moreover, the execution model for the preferred language may be
similar to that of AWK, SED, or PERL. Accordingly, whichever language is used, the
source code for the language should reside on the server farm and/or workstation so that
scripted tasks may be executed. For embodiments of the present application, such a
general purpose scripting language will be referred to as "Expert" (e.g., Expert language,
Expert code).

WO 2005/043356 PCT/US2004/038086

In that regard, each node may include modifiable, default Expert language to
accomplish the task of the particular named node. For example, a filtering node may
include the following default Expert language:

#describing output #1
(output 1
    (output-all-input-fields) # same fields as input
)

This expression configures output #1 of the node, describing it as having all of the
fields of the input. This particular example of a filtering node is a no-operation node, i.e.,
it simply writes every input record to the output. However, the Expert language may be
modified so that records, for example, for a particular US state may be output (e.g.,
Massachusetts) as set out below:

#describing output #1
(output 1
    (output-if (equals 'state' "MA")) # MA only
    (output-all-input-fields) # same fields as input
)
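
For readers unfamiliar with the Lisp-like notation, the behavior of the modified filtering node can be approximated in Python. This is only an illustrative sketch: the function name `filter_node`, the record layout and the sample rows are assumptions, and it does not reproduce the Expert runtime.

```python
def filter_node(records, field="state", value="MA"):
    """Yield every input record whose `field` equals `value`, keeping all fields."""
    for record in records:
        if record.get(field) == value:
            yield dict(record)  # pass all input fields through unchanged


# Hypothetical input records; any input with a "state" field would work.
rows = [
    {"customer": "A", "state": "MA", "amount": 10},
    {"customer": "B", "state": "NY", "amount": 20},
    {"customer": "C", "state": "MA", "amount": 30},
]
ma_only = list(filter_node(rows))
```

As in the Expert example, the filter depends only on the presence of a field named "state"; the remaining fields are carried through so that downstream nodes can still use them.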

It is worth noting that this example of Expert language for a filtering node is not
restricted to a particular type of input; it may be used wherever an input field named "state"
is present. In some embodiments, a constraint may be included in the scripting that inputs
require all referenced fields. This is preferable for iterative development since, during
construction of a BRG, if ever an additional piece of data is required from a data file (for
example) to implement a particular business rule, the data is available (e.g., using the above
Expert language, "output-all-input-fields", which allows passage of all other data).

Results from the task performed by one node may provide input to another node.
This may be done by graphically linking, in the BRG using the BRE (for example), one
node to another by clicking on an output of one node and dragging it to the input of another
node. The link defines the communication of data from the output of one node to the input
of another directly via, for example, TCP sockets.
Typically, each node is named according to the task the node is performing, so that
a user can quickly determine the task of a particular node. In that regard, a user definable
parameter for naming or labeling the node may be included, where a user may simply type
a name. In some embodiments, dynamic labeling of nodes may be included, in that a label
may be a short description determined from parameters of the node. For example, a sorting
node that sorts data on the column "CustomerID" could be adequately labeled with "Sort on
CustomerID". Each type of node may define a specific dynamic labeling technique, either
through scripting or through textual substitution (see below) on a particular parameter
name like "Custom Label" for example. In such a case, defining a parameter "Custom
Label" = "Sort on {{^Sort Column^}}" accomplishes this automatically. Accordingly, if
the parameter "Sort Column" is altered, the dynamic label may be altered instantly.
Through a preference control, a user may turn dynamic labeling off.

Preferably, every node is associated with a particular node type, of a plurality of
types of nodes provided in the primitives of the BRE, which determines the node's general
function. Types may be defined in at least one of three ways: by a file, by a local library,
and/or by a shared library. Those types that are defined in files may be the nodes that are
associated with the primitives (i.e., commonly used nodes for BRGs) in the BRE. Such
primitives may include: aggregate, composite, cat, Dbloader, filter, infile, join, lookup,
query-dump and sort.

A node may comprise either a simple node, which may use a single binary or script
to perform a particular action(s), or a composite node which may be defined by multiple
nodes in a sub-BRG (for example). This recursive composition allows management of the
complexity in large BRGs: a well-composed BRG using composite nodes is typically
much easier to understand, edit, and debug than a BRG where all nodes are visible at once
(e.g., a monolithic script).

Composition of simple nodes into a composite node may be accomplished by
combining two or more nodes (base nodes), along with their interconnections, into a single
node via a second or sub-BRG. A user can select a number of inputs and outputs
associated with the base nodes of a composite node for use as inputs/outputs of the
composite as a whole. A composite node may also be considered a pseudo node: in and of
itself, a composite node performs no computations. Rather, the nodes that make up a
composite node determine the processing task(s) of the composite node. In a BRG, a user
can choose to "drill into" (see Fig. 3, "Graph drill-down") a composite node to see the
configuration of the internal sub-BRG, to access the nodes that make up the composite and
corresponding parameter values of each. It is worth noting that the composition of nodes in
embodiments of the present invention may be analogous to an "integrated circuit".

In the event that a node is contained within a composite node and requires a
parameter value which has not been set, the value may be set on the composite node itself. 
In other words, setting a parameter on a composite node implicitly sets the parameter on all 
members of the composite where it has not been set. 

A library is a method for defining re-usable components (e.g., nodes) of one BRG,
which may then be used in other BRGs by reference. BRGs are preferably set up to include
an implicit library which is preferably stored in the same document as the BRG (or an
associated document). In the case of library nodes, which may be either simple and/or
composite nodes, each node may be available as a particular type (e.g., sort, aggregate,
etc.). If the parameters of a library node are modified, the modifications carry forward into
every instance of the node used in every BRG.

Using an inheritance function, a new library node may be created based on a current
library node (parent node) and inherit the parameters and associated default parameter
values of the parent library node type. Each parameter, however, may be overridden
in the new library node. In addition, a user may define new parameters and establish a new
node type with a different interface (for example). Thus, new nodes may be created based
on other nodes using the inheritance function as a basis. This allows for easy reuse of
functionality in BRGs, delivering time-savings and risk mitigation in creating and
maintaining BRGs.

In accordance with the inheritance function, embodiments of the present invention
may include rules for determining the setting of parameters in a node. For example, in one
embodiment, the values for the parameters for a node may be sought out first from the
particular node, then at the corresponding base (composite) node, then at a corresponding
parent node, and finally, if a parameter setting is not found, it is sought at a BRG parameter
level. BRG or graph level parameters are particularly useful for setting "global" properties
such as directory paths, database usernames and passwords, and the like.

Inherited parameter values for a new library node from a parent node may be color
coded so that a user can easily determine whether such parameter values have been
inherited from another node. For example, inherited parameter values may be in blue text,
and locally modified parameter values may be in black text. In one embodiment, deleting a
locally modified inherited parameter value automatically restores the inherited value of
the parameter.
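
The lookup order described above (node, then composite, then parent, then BRG level) and the restore-on-delete behavior can be sketched with Python's `collections.ChainMap`. The parameter names, values and directory path below are hypothetical, and the choice of `ChainMap` is an assumption made for illustration rather than a description of how the BRE stores parameters.

```python
from collections import ChainMap

# Hypothetical parameter scopes, listed from least to most specific.
brg_level = {"Database User": "ra_user", "Work Dir": "/data/work"}
parent_node = {"Sort Column": "CustomerID", "Order": "ascending"}
composite_node = {"Order": "descending"}
node_local = {"Sort Column": "InvoiceDate"}

# ChainMap searches its maps left to right, mirroring the lookup order.
params = ChainMap(node_local, composite_node, parent_node, brg_level)

local_value = params["Sort Column"]   # set on the node itself
composite_value = params["Order"]     # found on the enclosing composite
global_value = params["Work Dir"]     # falls through to the BRG level

# Deleting the local override exposes the inherited value again.
del params["Sort Column"]
restored_value = params["Sort Column"]
```

An editor built this way could decide the text color of a value by checking whether the key is present in the first map (locally modified) or only in a later one (inherited).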

When inheriting parameters from library composite nodes, it is often desirable to
adjust the implementation of the composite node. For example, a library node may define
a complex series of manipulations which are generally useful but in a particular single
instance may not be quite right. Although one may copy and modify the composite node
definition, it often leads to multiple sub-BRGs to maintain and clutters a library space with
special case scenarios. Instead, using an augmentation process, the user can edit "shadow"
nodes of the composite nodes. Shadow nodes represent instances of the internal
implementation of the composite (i.e., the underlying nodes). Since alterations (e.g.,
additions and/or deletions) to a library composite node are instantly reflected in all
derivatives, the shadow nodes provide a mechanism for interacting with and viewing the
state of the elements of a library composite in a particular instance. Moreover, a user can
override the parameter values of each of these shadow nodes, add new nodes to the
composite node, disable shadow nodes, add new inputs and outputs or delete existing
inputs/outputs, and alter the linking of the nodes within the composite node. Shadow nodes
may be distinguished from explicitly instantiated nodes by a visual indication in the BRG,
for example, by including a "shadow" behind the node.

With regard to the linking of the shadow nodes, since it can be confusing as to
whether a connection between two nodes is inherited or locally modified, the BRE may
display such connections differently to distinguish between the two. For example,
inherited connections may be a dashed blue, while explicit modified or local linking may
be solid black.
As stated earlier, template BRGs may be created to accomplish predetermined
tasks. When creating such template BRGs, it is often desirable to have multiple sub-BRGs
implemented simultaneously to allow a compiler to automatically choose one
implementation over another. Accordingly, a Bypass node may be used to facilitate this
functionality and is particularly useful for creating composite nodes that use multiple
sources of mutually exclusive or optional data. The Bypass provides a visual indication
that two or more alternate paths can be defined as the source of a single "virtual" data path.
The Bypass node chooses a first input that can be "satisfied" to realize the virtual data path
as its output. "Satisfied" may be defined as a node being enabled and all of its inputs linked
to other satisfied nodes. To that end, a Bypass node may be satisfied if it is enabled and at
least one of its inputs is satisfied.
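
The definition of "satisfied" is recursive, which the following sketch makes explicit. The `Node` class, its field names and the sample feeds are hypothetical and serve only to illustrate the rule stated above; an unlinked input is modeled here as `None`.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    enabled: bool = True
    inputs: list = field(default_factory=list)  # upstream Nodes, or None if unlinked
    bypass: bool = False


def satisfied(node):
    """A node is satisfied if it is enabled and all inputs link to satisfied nodes.

    A Bypass node needs only one satisfied input.
    """
    if node is None or not node.enabled:
        return False
    results = [satisfied(n) for n in node.inputs]
    return any(results) if node.bypass else all(results)


def bypass_choice(node):
    """Return the first satisfied input of a Bypass node, or None."""
    return next((n for n in node.inputs if satisfied(n)), None)


primary = Node("billing_feed", enabled=False)  # disabled, so not satisfied
fallback = Node("archive_feed")
source = Node("bypass", inputs=[primary, fallback], bypass=True)
chosen = bypass_choice(source)
```

Here the disabled primary feed is skipped and the Bypass node realizes its virtual data path from the fallback feed.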

In some embodiments of the invention, nodes may include a user-defined
performance metric parameter. Such a parameter qualifies a node's eligibility to operate on
a particular server. For example, a very large accumulator node may require a minimum of
4 gigabytes of RAM to operate and only one member of a server farm includes that much
RAM. Accordingly, some embodiments of the invention provide the ability to declare the
performance metric(s), and to associate these metrics with nodes in the BRG and with
servers in the farm. Thus, when used on a particular node, the node will be restricted to
being assigned by the controller only to a server that has the required minimum metrics. In
the event that two or more servers are eligible to run a node, the one with the best metrics
(from the point of view of the node) may be chosen.
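
A minimal sketch of such a selection follows. The metric name `ram_gb`, the server names and the scoring rule are assumptions: the text does not define how "best" is measured, so this example simply prefers the eligible server offering the most of whatever the node asked for.

```python
def pick_server(requirements, servers):
    """Return the name of the best eligible server, or None if none qualifies.

    requirements: metric name -> minimum value needed by the node
    servers: server name -> {metric name -> value offered}
    """
    eligible = {
        name: metrics
        for name, metrics in servers.items()
        if all(metrics.get(m, 0) >= need for m, need in requirements.items())
    }
    if not eligible:
        return None
    # Assumed scoring: the largest total of the metrics the node requires.
    return max(eligible, key=lambda n: sum(eligible[n][m] for m in requirements))


# Hypothetical server farm.
farm = {
    "srv1": {"ram_gb": 2},
    "srv2": {"ram_gb": 8},
    "srv3": {"ram_gb": 4},
}
chosen = pick_server({"ram_gb": 4}, farm)
```

With these values, `srv1` is ineligible, and `srv2` is preferred over `srv3` because it exceeds the 4 gigabyte minimum by the larger margin.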

The value of a parameter can be partially or completely specified through a textual
substitution mechanism. Syntactically, textual substitution may be indicated by a character
prefix and suffix. For example, the prefix may be "{{^", and a suffix may be "^}}".
Between the prefix and the suffix, a user can enter the name of a parameter. The value of
this parameter may then be substituted in place of the text from the prefix to the suffix.
The parameter may be evaluated using previously defined parameter inheritance rules
stated above (i.e., check the node, then its base node(s), then its parent node(s), then the
BRG level parameters). In the event that none of these are set, the BRE may prompt the
user to set a BRG level parameter. If the user refuses, then the operation necessitating the
substitution (typically execution or compilation) may be cancelled. However, instead of
demanding a value, the user can include a default value in the textual substitution request
by following the parameter name with a specified character (e.g., "=") followed by a
default value. If a blank value is acceptable, then the "=" may be followed immediately by
the suffix.
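
The substitution rules above can be sketched with a regular expression. The function name `substitute`, the parameter names and the path are hypothetical; raising an error when no value or default exists stands in for the BRE prompting the user, which this sketch does not attempt to model.

```python
import re

# Matches {{^Name^}} or {{^Name=default^}}, following the example delimiters.
PATTERN = re.compile(r"\{\{\^(?P<name>[^=^]+)(?:=(?P<default>[^^]*))?\^\}\}")


def substitute(text, scopes):
    """Replace each {{^Name^}} using the first scope that defines Name.

    scopes: list of dicts ordered node, base node, parent node, BRG level.
    """
    def lookup(match):
        name = match.group("name").strip()
        for scope in scopes:
            if name in scope:
                return str(scope[name])
        if match.group("default") is not None:
            return match.group("default")  # may be the empty string
        raise KeyError(f"no value for parameter {name!r}")

    return PATTERN.sub(lookup, text)


# Hypothetical scopes: node-level first, BRG-level last.
node = {"Sort Column": "CustomerID"}
brg = {"Work Dir": "/data/work"}

label = substitute("Sort on {{^Sort Column^}}", [node, brg])
path = substitute("{{^Work Dir^}}/{{^File=out.csv^}}", [node, brg])
blank = substitute("[{{^Suffix=^}}]", [node, brg])
```

The three calls exercise, in turn, a value found on the node, a value found at the BRG level combined with a default, and a blank default written as "=" followed immediately by the suffix.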

According to some embodiments of the invention, textual substitution may be used
with respect to the Boolean evaluation of whether a particular input or a node is satisfied.
For example, if a two-character sequence (for example, "»") is found between the prefix
and suffix, then any text before the "»" may be determined as an input name
or number. Any text following the "»" may be determined to be a node name. Either can
be blank, but preferably, not both.

The evaluation of the Boolean value proceeds by locating a node that matches the
description. Accordingly, first the node where the substitution is required is examined. If
the description cannot be found there, siblings of the node may then be examined, then
the parent node (and so on). When the correct node is located, a Boolean value is
returned indicating whether the specified node or input is satisfied.

Textual substitution may be performed to specify user-defined values to be
incorporated directly into the source code and to define how user-defined parameters alter
the behavior of a node, since embedded source code for the Expert language corresponds to a
multi-line parameter.

By default, a node in a BRG is enabled, with an enabling attribute being, for
example, a Boolean parameter. This parameter may be set explicitly, through inheritance or
containment. As well, textual substitution may be used to define the value of "Enabled".
This feature allows nodes to be enabled/disabled on the basis of how other parts of the
BRG are connected or satisfied (for example).

A node is, by default, also not mandatory to a BRG, and a mandatory attribute may
be an ordinary Boolean parameter. As such, it can be set explicitly, through inheritance or
containment. As well, textual substitution can be used to define the value of "Mandatory".
A mandatory node may include two special properties. First, if it cannot be satisfied, then
attempts to compile the BRG into a standalone application will fail (where a suitable error
message may be displayed). Second, an optional request to compile only mandatory nodes
will elide any node that is neither mandatory nor needed by a downstream mandatory node.
This provides an effective way to include debugging nodes in a BRG without compiling
them for production.
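
The mandatory-only compile described above can be pictured as a reachability walk. The following Python sketch is illustrative only: the graph representation, the function name and the node names are assumptions, not part of the BRE.

```python
def nodes_to_compile(upstream, mandatory):
    """Return the nodes kept by a mandatory-only compile.

    upstream: dict mapping a node name to the nodes that feed it (assumed shape).
    mandatory: set of node names whose "Mandatory" parameter is true.
    """
    keep = set()
    stack = list(mandatory)
    while stack:
        node = stack.pop()
        if node in keep:
            continue
        keep.add(node)
        # Anything feeding a kept node is needed by a downstream mandatory node.
        stack.extend(upstream.get(node, []))
    return keep


# Hypothetical graph: "load" reads "filter", which reads "infile";
# "debug_dump" also reads "filter" but is not mandatory, so it is elided.
upstream = {
    "filter": ["infile"],
    "load": ["filter"],
    "debug_dump": ["filter"],
}
kept = nodes_to_compile(upstream, mandatory={"load"})
print(sorted(kept))
```

Under these assumptions the debugging node drops out of the compiled set while everything the mandatory node depends on is retained.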

For the parameters "Enabled" and "Mandatory" it may sometimes be necessary to
combine multiple Booleans. These parameters support Boolean expressions in, for example,
Expert syntax style, i.e., (and x x x) (or x x x) (not x). By default, the "Mandatory"
parameter is always "anded" with the "Enabled" parameter. For example, a given database
loader might be Enabled and Mandatory if:

1) DatabaseLoading is true at the BRG level;

2) CustomerServiceRecords are connected and satisfied; and

3) SkipSlowSteps is false.

Thus, one may use Boolean operators to combine these as follows:

Enabled = (and {{^DatabaseLoading^}} {{^»CustomerServiceRecords^}}
(not {{^SkipSlowSteps^}}))

Mandatory = true
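
As an illustration of the substitution and inheritance rules, the Python sketch below resolves parameter references against an ordered list of scopes. It assumes the "{{^" and "^}}" delimiters and the "=" default marker; the function name and the use of an exception in place of the user prompt are hypothetical, and the input/node form is not handled.

```python
import re

# Assumed delimiters: prefix "{{^" and suffix "^}}".
PATTERN = re.compile(r"\{\{\^(.*?)\^\}\}")


def substitute(text, scopes):
    """Replace each {{^name^}} or {{^name=default^}} in text.

    scopes: parameter dicts ordered node, base node(s), parent node(s), BRG.
    """
    def resolve(match):
        name, has_default, default = match.group(1).partition("=")
        for scope in scopes:
            if name in scope:
                return str(scope[name])
        if has_default:
            return default  # may be empty when "=" is followed by the suffix
        # Stand-in for the BRE prompting the user and cancelling on refusal.
        raise KeyError(f"parameter {name!r} is not set at any level")

    return PATTERN.sub(resolve, text)


node = {"Table": "usage"}
brg = {"DatabaseLoading": "true", "Table": "ignored"}
result = substitute(
    "load {{^Table^}} if {{^DatabaseLoading^}} {{^Mode=fast^}}{{^Note=^}}",
    [node, {}, {}, brg],
)
print(result)
```

The node-level value of Table wins over the BRG-level one because the scopes are searched in inheritance order, and the two references with "=" fall back to their defaults.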

Node Types 

The following is an exemplary list of node types for use with embodiments of the
invention. Please note that this list is not meant to limit the scope of the invention, but
rather to give examples of the types of processes that may be set up for a node. As stated
above, each node may include Expert language to perform particular tasks (e.g., to
structure output for a next node process). Moreover, some of the node types listed below
are directed to processing and/or validating data from a telecommunications system for
revenue assurance; these are meant as examples only and are not intended to be limiting.

Accum: this node receives a data set and groups the output data set according to the
accumulator specified in Expert. This node may be useful for calculating counts and sums
on a data set. Works like Agg (see below); see also Accum-output and Define-accum. This
node may include one input and one or more outputs.

Name | Type | Description
OutputExprFile | Inlinefile | Expert expressions that define output structures.

For example, having a record set with two fields, where the first field has an
account number and the second field has a TN (telephone number), an Accum node may be
used to group an output file by account number and add a field indicating the number of
TNs for each account id.

Agg: this node receives a data set and groups the output data set depending on the
aggregator specified in the AggExprFile attribute. This node may also be useful for
calculating counts and sums on a data set. The input data is grouped (sorted) by the
specified aggregator. This node may include one input and one or more outputs.

Is-agg-done is a value that can be used within the context of an Agg node and that is
preferably maintained at a system level. In other words, there is no need for the
user to update or reset the value. This is a Boolean value that will be true if the
current line (input record) is the last line of a group that is determined by the value
of the AggExprFile attribute; otherwise its value is false. If the AggExprFile
attribute is set to 1, for example, then the aggregate is the whole input data set. This
provides a method of determining when the end of an input data set is reached.

Name | Type | Description
OutputExprFile | Inlinefile | Expert expressions that define output structure.
AggExprFile | Inlinefile | Defines the fields to group the output by (preferably one output).

For example, if a record set includes two fields, the first field is an account number
and the second field is a TN, the Agg node may be used to group an output file by account
number and add a field indicating the number of TNs for each account id.
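
The account/TN example can be sketched in Python as follows. This is not Expert code: the record layout, the function name and the sample numbers are hypothetical, and the `is_agg_done` flag only mimics the behaviour described for Is-agg-done.

```python
from itertools import groupby

# Hypothetical input, already sorted by account as the Agg node expects.
records = [
    ("A100", "6175550001"),
    ("A100", "6175550002"),
    ("B200", "6175550003"),
]


def count_tns(rows):
    """Return (account, tn_count) pairs, one per group of equal accounts."""
    out = []
    for account, group in groupby(rows, key=lambda r: r[0]):
        group = list(group)
        count = 0
        for i, _row in enumerate(group):
            count += 1
            is_agg_done = i == len(group) - 1  # true only on the group's last line
            if is_agg_done:
                out.append((account, count))
    return out


counts = count_tns(records)
print(counts)
```

The count is emitted only when the last line of a group is reached, which is the role the text assigns to Is-agg-done.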



Binary: this node may be used to execute a binary executable file. The binary
executable is deployed, for example, in the appropriate directory on a back-end server.
This node may include zero (0), one (1) or multiple inputs and/or outputs.

Name | Type | Description
Binary | String | Path and name of the binary file to be executed.



Bundler: this node may be used to combine multiple sources of input that all have
the same format and create one output source (see Fig. 7; node 710). The
parameters/fields for this type of node are inputs and one output. This node is useful as a
visual aid for BRGs where a large number of inputs and outputs associated with
one node would otherwise clutter the BRG. A bundler node is similar to a composite
node, but it is a composition of data rather than a composition of operators. Before data
streams can be accessed, however, a bundler node must be linked to a pseudo node called
"unbundler". Bundlers and unbundlers are analogous to male and female multi-pin
connectors in electronic devices.

For example, this node may be used within the BRG that makes up a composite
node, where the end result of a composite node is a large number of outputs. Thus, the
outputs can be bundled up within the composite node's sub-BRG so that a single source of
output can be shown. On the BRG where the composite node resides, the output of the
composite node is sent to an Unbundler node (see below), where the respective outputs are
broken down.
Cat: this node may be used to combine data sets, and may include one or more
inputs and one output.

Name | Type | Description
stripHeaders | String | A value of "true" populated here will drop the column headers from the output data.
catType | String | "union" takes all columns from all of the inputs, "intersection" takes all of the columns that are in all of the inputs and "exact" requires that all of the inputs have the same columns.

Example: having input data consisting of three (3) input sources, where each
source has one record and each source has one field named circuit_count, a resulting
single output data set will include 3 rows, where each of the rows contains a circuit_count
value from a respective input source.
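
The three catType settings can be illustrated with a small Python sketch. The row representation (a list of dicts per input), the function name and the choice to fill absent columns with None are assumptions made for the illustration.

```python
def cat(inputs, cat_type="union"):
    """Concatenate row lists (each row a dict) under one of three column rules."""
    column_sets = [set().union(*(row.keys() for row in rows)) for rows in inputs]
    if cat_type == "union":
        columns = set().union(*column_sets)
    elif cat_type == "intersection":
        columns = set.intersection(*column_sets)
    elif cat_type == "exact":
        if any(cols != column_sets[0] for cols in column_sets):
            raise ValueError("inputs do not share the same columns")
        columns = column_sets[0]
    else:
        raise ValueError(f"unknown catType {cat_type!r}")
    ordered = sorted(columns)
    # Columns absent from a row are filled with None (an assumption).
    return [{c: row.get(c) for c in ordered} for rows in inputs for row in rows]


# Three sources, one record each, one field each, as in the example above.
sources = [[{"circuit_count": 4}], [{"circuit_count": 7}], [{"circuit_count": 9}]]
combined = cat(sources, "exact")
print(combined)
```

With identical columns in every source, all three settings give the same three-row result; they differ only when the inputs disagree on columns.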

Clock: this node defines sequential dependencies between the executions of nodes
within a BRG. This node is preferably for display purposes, as other functionality may be
established using other nodes. For example, there may be a number of SQL statements that
require execution in a certain sequence, where the structure of a BRG does not explicitly
dictate the sequence. In such a case, one could associate the nodes in question by using a
Clock node.

As shown in Fig. 5, clicking on the clock node attached to a first node and then
dragging the mouse over to the second node in the sequence dependency creates a
dependency line. Thus, as shown, the "Filter desired jurisdictions..." node 520 must
complete execution before the "Prepare for ICTA" node 530 starts execution.
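
One way to picture a clock dependency is as an extra ordering edge that is merged with the data-flow edges before an execution order is computed. The Python sketch below uses the standard library's `graphlib` for the ordering; the node names (including the upstream "Load usage" node) and the dictionary layout are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical node names; each key must wait for the nodes in its set.
data_flow = {
    "Prepare for ICTA": {"Load usage"},
    "Filter desired jurisdictions": {"Load usage"},
}
# A clock link adds an ordering constraint that the data flow alone does not imply.
clock_links = {"Prepare for ICTA": {"Filter desired jurisdictions"}}

deps = {}
for source in (data_flow, clock_links):
    for node, preds in source.items():
        deps.setdefault(node, set()).update(preds)

order = list(TopologicalSorter(deps).static_order())
print(order)
```

Without the clock link the two downstream nodes could run in either order; with it, the filter node always precedes the preparation node.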

CombineLineResultsFiles: this node may be used to combine a set of line level
files from the directory specified in a ResultsDirectory node parameter into a library that is
specified in the Library node parameter. This node may include one (1) input and zero (0)
outputs.

Name | Type | Description
ShadowFileName | String | File that is used to store temporary state information during the execution of this node. This value should be the same as the file initialized in an InitializeCombineLineResultsFiles node.
Merge | String | "true" means that the records being processed are in a summarized format where usage data has been grouped together for a particular WTN (working telephone number). "false" means that the records being processed are in a raw call-by-call format and have not been aggregated by WTN.
Library | String | Directory where the output is placed.
ResultsDirectory | String | Specifies the directory of the input to the node; this is the directory where the output from the usage proc execution resides.



Composite: this node may be used to group other nodes together visually and/or
functionally, serving as a visual aid for BRGs where there are a large number of nodes
that clutter the BRG. Thus, this node may include zero, one or multiple inputs and/or
outputs.

Convert: this node may be used to convert data that is in a non-tab delimited
format into a tab-delimited format (for example). This is similar to an Infile node (see
below). Preferably, the data should already have a header. The node generally may
include zero inputs and one output.

Name | Type | Description
InDelimiter | String | Input file delimiter.
Convertfile | String | Identifies the input file to be converted.



ConvertNonBrain: this node may be used to append field names to the top of each
column of a file that has no headings, and may also be used to convert data to a predefined
delimited format.

Name | Type | Description
Header | String | Contains the headers to be added to each column, separated by commas. For example, the value might be: file,date,type
InDelimiter | String | The symbol used to delimit the input file.
File | String | The path and name of the input file.



ConvertPositional: this node may be used to convert an input file of fixed width
(no header) to a delimited format (similar to an Infile node; see below). Specifically, the
specification for the format may include colon-separated field entries, where each field
entry is of the form name,start,size. This node may include zero (0) inputs and one (1)
output.

Name | Type | Description
Spec | String | The positional specification.
Positionalfile | String | Identifies the input file to be converted.
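
A minimal Python sketch of how a name,start,size specification might be applied to a fixed-width line is given below. Whether start is 0-based or 1-based is not stated in the text; the sketch assumes 0-based offsets, and the function names and sample data are hypothetical.

```python
def parse_spec(spec):
    """Parse "name,start,size:name,start,size" into (name, start, size) tuples."""
    fields = []
    for entry in spec.split(":"):
        name, start, size = entry.split(",")
        fields.append((name, int(start), int(size)))
    return fields


def convert_line(line, fields, delimiter="\t"):
    """Slice one fixed-width line; start is assumed to be a 0-based offset."""
    return delimiter.join(line[start:start + size].strip() for _, start, size in fields)


fields = parse_spec("account,0,6:tn,6,10")
header = "\t".join(name for name, _, _ in fields)
row = convert_line("A100  6175550001", fields)
print(header)
print(row)
```

The header line is built from the field names in the specification, since the fixed-width input itself carries no header.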



Dbloader: this node performs data loads into a database (e.g., Oracle), and may
include one input and zero, one or multiple outputs.

Name | Type | Description
DBUser | String | The database username.
DBPassword | String | The database password.
DBService | String | The database instance name.
AbortThreshold | String | The number of rows that will be allowed to error out before rolling back a data load. The default value of this parameter is infinity.
DbOutputName | String | The output name to be used for the data load.
OutputExpr | Inlinefile | Expert language to define the output structure of the data load; the output fields created here should match the columns of the table being loaded.
DBTable | String | The table to be loaded with data.
MissingColumnBehavior | String | Possible values: {"error", "log", "ignore"}. This value defines the behavior of the system if a record that is about to be loaded is missing data from a particular field in a table. Ignore: do nothing, continue processing as normal. Error: stop processing. Log: log the discrepancy between the data to be loaded and the table structure, then continue processing.
ExtraFieldBehavior | String | Possible values: {"error", "log", "ignore"}. This value defines the behavior of the system if a record that is about to be loaded contains a field that is not defined in the destination table.



Diff: this node may be used to generate PC/MOU discrepancies between two
homogeneous line level input files, and may include two inputs and one output.

Name | Type | Description
Zone | String | This is a descriptor string, which is appended to the output records. It will typically be location based, e.g., "BOSTON".
rundate | String | The date that the particular usage records are from.
threshold | String | The average MOU difference allowed per call.
exclude_file | String | This is a list of WTNs to exclude from comparisons.
discrepencytype | String | "AMA", "Bill"; this will be a string value that describes the type of discrepancy being checked for.
Columns | String | Indicates which PC/MOU pairs to compare.



DirectoryList: this node may be used to scan a specified directory to find all
contents that match what is specified (which may support wildcarding). The contents may
be output to the output file under the column name FileName.

Name | Type | Description
Spec | String | The specification to use to scan the directory.
DirectoryName | String | The directory to scan.



DummyInput: this node may be used to create a test input source consisting of one
column and a specified number of rows with no data populated. A type may be specified
by appending a :type identifier after the name.

Name | Type | Description
(name not given) | String | Column header.
Numlines | String | Number of rows.



ExecuteSubgraph: this node is used to execute a BRX file associated with another
BRG, and may include one input.

Name | Type | Description
BrxFileName | String | Specifies the path and file name of the .brx file to be executed. The path of the file is server oriented.



Fatfinger: this node may be used to compare two data sources, to find "near"
matches of TNs (e.g., off by one). This node type may include two inputs and one or more
outputs.

Name | Type | Description
Inputfield1 | String | Specifies the first column to be compared.
Inputfield2 | String | Specifies the second column to be compared.
Fieldmask | String | Specifies which digits to look at. For example, a value of "xxxxxx1111" would only look at the last four digits of the phone number.
OutputExprFile | Inlinefile | Expert language to determine the structure of the output data.
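
A near-match test of this kind might look like the Python sketch below. The text does not define "off by one" precisely; the sketch assumes it means that exactly one compared digit differs, and it assumes a mask in which "1" marks a digit to compare and "x" a digit to ignore. The function name and sample numbers are hypothetical.

```python
def near_match(tn_a, tn_b, mask, max_diffs=1):
    """True when the masked digits differ in 1..max_diffs positions.

    mask: same length as the numbers; positions marked "1" are compared and
    positions marked "x" are ignored (an assumed reading of Fieldmask).
    """
    if not (len(tn_a) == len(tn_b) == len(mask)):
        return False
    diffs = sum(1 for a, b, m in zip(tn_a, tn_b, mask) if m == "1" and a != b)
    return 1 <= diffs <= max_diffs


mask = "xxxxxx1111"
print(near_match("6175550001", "6175550091", mask))  # one masked digit differs
print(near_match("6175550001", "6175550001", mask))  # identical, so not a near match
```

Exact matches are deliberately excluded here so that only likely keying errors are reported; a real node could treat them differently.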



wo 2005/043356 



-25 - 



PCT/US2004/038086 



FileCat: this node may be used to concatenate multiple files into one input source.
This may be used with a FilesFromLibrary node to combine multiple sets of usage into one
file. This node may include one input and one output.

Name | Type | Description
FilenameExpr | String | This denotes the column header of the input file, which should hold a set of file names. This type of input file will come from a FilesFromLibrary node. This can also be an expression that uses the data in the input file to construct a filename.



FilesFromLibrary: this node may be used to retrieve a set of usage files and store
the file names in an output file. This node may include an output.

Name | Type | Description
calldate | String | A string that describes the interval of the calls to be loaded; typically it might look something like the following: "2003120820031209".
filedate | String | Date that the usage files were created.
Format | String | Format of the files to be retrieved, e.g., "AMA", "SS7".
Type | String | Type of the files to be retrieved.
library | String | Directory path of the library where the usage files reside.
fileNameColumn | String | Header of the column in the output file that contains the file names of the retrieved usage files.

For example, a sample output for a FilesFromLibrary node:

filename:string
/hosts/jigsaw-sun/raid0/bdrosen/Testing/RLGHNCMO84G/lib/20031208/20031209/AMA/C^
/hosts/jigsaw-sun/raid0/bdrosen/Testing/RLGHNCMO84G/lib/20031208/20031208/AM^

The above output was produced by the following parameters:

filedate - 2003120820031209
calldate - 2003120820031209
library - /hosts/jigsaw-sun/raid0/bdrosen/Testing/RLGHNCMO84G/lib
format - AMA
type - CDRS
fileNameColumn - filename



Filter: this node may be used to transform data using a simple pass-through
operation. For example, if instructed, one column may be removed from the output file.
This node may include one (1) input and one (1) or more outputs.

Name | Type | Description
OutputExprFile | Inlinefile | Expert language expression(s) to alter the structure and/or content of the input file and produce an output.

For example, a Filter node may take a usage file for a telecommunications system
as an input and remove all records that do not have a duration of greater than 5 seconds.

FinalizeCombineLineResultsFiles: this node may be used to finalize population of
a library performed by one or more previous CombineLineLevelResultsFiles nodes using a
temporary state file that is referenced in the ShadowFileName node parameter.

Name | Type | Description
ShadowFileName | String | Temp file used to store state information associated with the activities of combining Line Level usage files. There should be an InitializeCombineLineResultsFiles and a CombineLineResultsFiles node that also have the same value in this parameter.



Herefile: this node may be used to introduce a datastream directly into a BRG
instead of loading it from an external file or database. Specifically, a parameter of the node
defines the particular data directly.

Name | Type | Description
Herefile | String | Specifies particular data to be output.


Infile: this node may be used to import data from a file into a BRG, and typically
includes one output.

Name | Type | Description
Infile | String | Specifies the path and filename of the input data.



InitializeCombineLineResultsFiles: this node may be used to initialize a
temporary state file that is used by a CombineLineResultsFiles node when executed.

Name | Type | Description
ShadowFile | String | The name of the temporary state file to be used (arbitrary).



Join: this node may be used to join two record sets based on predetermined criteria,
populated in a JoinExprFile parameter. The two inputs preferably must be in properly
sorted order as specified by the Expert join expression in the JoinExprFile parameter. This
node may include more than two inputs and may have one (1) or more outputs.

Name | Type | Description
JoinType | String | Possible values = {l (=left-outer), i, r (=right-outer), li, ri}.
JoinExprFile | Inlinefile | Expert language comparison statement. If the comparison made for a record returns 0, then both sides of the comparison are equal. Depending on the return value of the comparison and the JoinType specified, a given record may continue to be processed so that it may be output by the output expression defined in OutputExprFile.
OutputExprFile | Inlinefile | Contains an expression that defines the output structure.



LineMatcher: this node may be used to determine Matched, UnMatched, and
Multiple-Matched lines/data in an input file. The node may output four streams: uniquely
matched lines, multiply matched lines, unmatched lines, and matched ids. One use for a
LineMatcher node may be to remove duplicate call records from a set of usage data in
validating data of a telecommunications system.

Name | Type | Description
MustMatchColumns | String | An array of column names by which the input is sorted and whose values must be identical for a match to occur.
PrimaryRangedColumn | String | The name of the column that contains values that will be used to perform the windowing (this is an algorithm that determines the window of lines eligible for matching). The input should be sorted by this column after the MustMatchColumns.
MaxPrimaryRange | String | This is the maximum difference between the values of the primary ranged column for two lines that the algorithm will consider to be a match.
RangedColumns | String | An array of non-primary column names whose values must fall in a range for a match to occur.
MaxRanges | String | An array of numbers that correspond to the ranges used for the RangedColumns.
ColumnsThatCannotMatch | String | An array of column names that must NOT be equal for two lines to match.
LineIdColumn | String | The name of the column that uniquely identifies a line.
Lookup: this node is similar to a Join node but includes additional performance
capabilities. For example, lookup nodes load the second of two inputs (for example) into a
cache that allows for faster processing of data comparisons. This node may be used when a
second data set is small (e.g., a block of reference data). This node may load all records
from the first input to be processed in OutputExprFile. If a match is found in the second
input, a variable $is-match-found will be true; otherwise it will be false.

Lookup may be used for accomplishing "Inner join" and "Left join" operations. In
the case of Inner join, join may result in the full Cartesian product of all of the matches in
the second input, but lookup will result in one of the matches. Accordingly, it is
recommended that the data in the second input be unique with respect to the keys to avoid
any uncertainty in which data from the second input is available. This node may include a
pair of inputs and one or more outputs.

Name | Type | Description
InputKeyExpressionFile | Inlinefile | Expert language for indicating a key value to be compared; this may be a column name from the larger input that is not meant to be cached.
LookupKeyExprFile | Inlinefile | Expert language for indicating a key value to be compared; this may be a column name from the smaller cached input that is being compared.
OutputExprFile | Inlinefile | Expert language for defining output. Any records that pass the defined comparison test will be processed by the Expert code in this parameter.
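
The caching behaviour can be sketched in Python with a dictionary built from the second input. The function name, the record shape and the rule that the first duplicate key wins are assumptions; the text only says that one of the matches results.

```python
def lookup(first_input, second_input, input_key, lookup_key):
    """Yield (record, match, is_match_found) for every record of the first input."""
    cache = {}
    for row in second_input:
        # Keep the first row per key; which duplicate wins is an assumption.
        cache.setdefault(row[lookup_key], row)
    for record in first_input:
        match = cache.get(record[input_key])
        yield record, match, match is not None


usage = [{"tn": "6175550001"}, {"tn": "6175550009"}]
reference = [{"number": "6175550001", "state": "MA"}]
results = list(lookup(usage, reference, "tn", "number"))
for record, match, found in results:
    print(record["tn"], found, match["state"] if found else None)
```

Every record of the first input is passed through, which corresponds to a left join; keeping only the records where the flag is true gives the inner-join behaviour.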



MergeSortedUsage: in the case of telecommunications data validation, this node
may be used to receive a file with usage records sorted by WTN, which have MOU-paycount
pairs. The output(s) of this node may be an aggregated sum of usage totals for
each WTN in the input file. Preferably, the input file for this node is sorted.

MultiMatcher: this node may be used to determine Matched/Unmatched lines/data
from multiple matched info. This node may output two (2) streams: uniquely matched
lines and unmatched lines. Generally, this node uses a list of matched IDs as input from a
LineMatcher node and multiple matched lines from a LineMatcher node, and may include a
pair of inputs and a pair of outputs.

Name | Type | Description
PrimaryRangedColumn | String | Name of the column that contains values that may be used to find the closest match between multiply matched records.
RangedColumns | String | An array of non-primary column names whose values are used to find the closest match.
ColumnsThatCannotMatch | String | An array of non-primary column names whose values are used to find the closest match.
LineIdColumn | String | The name of the column that uniquely identifies a line.



Outfile: this node may be used to write an input to a specified file.

Name | Type | Description
OutFile | String | Filename to save to.
Perlfunc: this node may be used to execute a Perl script, and may include zero (0),
one or more inputs and/or outputs.

Name | Type | Description
module | String | Perl module to be executed from.
function | String | Perl function to be executed.



Pythonfunc: this node may be used to execute a Python function, and may include
zero (0), one (1) or multiple inputs and/or outputs.

Name | Type | Description
module | String | Python module to be executed from.
function | String | Python function to be executed.



Querydump: this node may be used to execute one or more SQL queries from a
database (e.g., Oracle) and provide the results as a virtual input. In general, a Querydump
node will not have an input other than the virtual input, but may have one or more outputs.

Name | Type | Description
DBUser | String | Oracle DBUserName.
DBPassword | String | Oracle DB Password.
DBService | String | Oracle DB Service.
QueryFile | Inlinefile | Holds the SQL that queries the database; the SQL here does not need to be embedded within Expert code.
OutputExprFile | Inlinefile | Expert language that defines the output for the data that is retrieved from the SQL in the QueryFile field.



Rotatefile: this node may be used to create a file with one line, containing a column
per line in the original file. In some embodiments, TypeDefault takes precedence over
TypeColumn. If neither is set, string is the default.

Name | Type | Description
NameColumn | String | This value should be equal to one of the column names of the input file. The value under this column for each row of the input file will turn into a column header on the output file.
ValueColumn | String | This value should be equal to one of the column names of the input file. The value under this column for each row of the input file will now be a field value.
TypeColumn | String | This value should be equal to one of the column names of the input file. The value under this column for each row of the input file will now be the type of the column in the output (optional).
TypeDefault | String | The type to use for all columns (optional).

Example:

Input File:
bar:string foo:string
hello bye
hello bye

NameColumn: foo
ValueColumn: bar

Output File:
bye:string bye:string
hello hello
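
The rotation in the example above can be reproduced with a few lines of Python. The function name and the dict-per-row representation are assumptions; the sketch applies a single default type to every output column and does not model TypeColumn.

```python
def rotate(rows, name_column, value_column, type_default="string"):
    """Turn each input row into one output column; returns (header, values)."""
    header = [f"{row[name_column]}:{type_default}" for row in rows]
    values = [row[value_column] for row in rows]
    return header, values


rows = [{"bar": "hello", "foo": "bye"}, {"bar": "hello", "foo": "bye"}]
header, values = rotate(rows, name_column="foo", value_column="bar")
print("\t".join(header))
print("\t".join(values))
```

Each input row contributes one column, so two input rows with the same name value yield two identically named output columns, as in the example.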



Sort: this node may be used to sort an input file by a specified field(s). If more
than one input is used, the column types and order should be identical across all inputs.

Name | Type | Description
CompareOrder | String | Defines the field that we are sorting by. Records will be sorted in ascending order.
Unique | String | If "true" (string value) is populated, duplicates are dropped.
CompareOrderExpr | Inlinefile | Expert language to determine comparison order (instead of CompareOrder).



Sqlrunner: this node may be used to execute SQL statements on a given data set
and may be used, for example, to query the Oracle DB. Although this node is very similar
to the Querydump node, it is not typically as efficient. This node may be used to insert data
into a database as well. If there is no input, the node is typically run once; if there is an
input, it will run once per input line. See Fig. 6.

Name | Type | Description
DBUser | String | Oracle DBUserName.
DBPassword | String | Oracle DB Password.
DBService | String | Oracle DB Service.
CommitFrequency | String | How many records are processed prior to committing an SQL transaction on a given data set. This field is not required.
OutputExprFile | Inlinefile | Expert language for an SQL statement.



Tail: this node may be used to remove records from an end of a given input
dataset.

Name | Type | Description
Number | String | The first X rows of an input data source will be written to an output, where X is the number entered in this parameter.



Unbundler: this node may be used as a visual aid for BRGs where there are a
number of inputs and outputs present that clutter a BRG. Unbundlers are typically used in
conjunction with composite nodes (which typically include a number of outputs). As
shown in Fig. 7, in order to simplify a BRG visually, substantially all (or preferably all) of
the outputs are loaded into a bundler node 710 so that a composite node can appear to have
one output source as opposed to more than 10.

Accordingly, as shown in Fig. 8, a composite node includes a "single" output 810
which is sent to an unbundler, which then breaks down all of the actual outputs and directs
them to the appropriate nodes.

UsageReader: this node may be used to validate telecommunication data, for
example, to process usage of a specified type, making the fields of the input available as
input 1 and the fields of the CDR available as a virtual input 2. The following are the fields
and types supported by the CDR input:

FileNumber (representing the line number of the current usage file from input 1)
OrigDisplayNumber - long integer
TermDisplayNumber - long integer
TermResolvedNumber - long integer

ConnectDate - integer
ConnectTime - integer
DisconnectDate - integer
DisconnectTime - integer

HoldSeconds - float

CallType - integer (has constants for the possible values)
Features - integer (bitfield of possible values, all of which have constants)
ChargeType - integer (has constants for the possible values)
BillingNumber - long integer
BillingSeconds - float

Jurisdiction - integer (has constants for the possible values)

OrigRateCenter - string
OrigLATA - integer
OrigState - string
OrigCountry - string

TermRateCenter - string
TermLATA - integer
TermState - string
TermCountry - string

PeerRateCenter - string
PeerLATA - integer
PeerState - string
PeerCountry - string

RecordingRateCenter - string
RecordingLATA - integer
RecordingState - string
RecordingCountry - string

OrigOCN - string
TermOCN - string
PeerOCN - string
RecordingOCN - string

OrigCarrierCode - string
TermCarrierCode - string
PeerCarrierCode - string
RecordingCarrierCode - string

OrigCarrierType - integer (has constants for the possible values)
TermCarrierType - integer (has constants for the possible values)
PeerCarrierType - integer (has constants for the possible values)
RecordingCarrierType - integer (has constants for the possible values)

IXC - integer

OrigRoutingNumber - long integer
TermRoutingNumber - long integer

OrigEndOffice - string
TermEndOffice - string
Peer - string
Recording - string

RoutingType - integer (has constants for the possible values)
RecordingPoint - string
OPC - string
DPC - string

InboundTrunkGroup - integer
InboundTrunkGroupMember - integer
OutboundTrunkGroup - integer
OutboundTrunkGroupMember - integer

SwitchDirection - integer (has constants for the possible values)
CarrierDirection - integer (has constants for the possible values)
SourceType - integer (has constants for the possible values)

40 

The following constants are also provided which will be used to test the values of 
certain of the fields of the cdr input: 



General 
%Other
%Unknown
%NotApplicable

CallType
%Local
%LocalToll
%LongDistance
%LocalDirAssist
%LongDistDirAssist
%Emergency
%Free

Features
%ThreeWayCall
%AutoCallback
%ForwardedCall
%RemoteForwardedCall
%OperatorAssisted
%Duplicate

ChargeType
%Normal
%TollFree
%PremiumFee
%CallingCard
%Collect
%CoinPaid

Jurisdiction
%IntraLATA
%Intrastate
%IntraLATA_Interstate
%Interstate
%IntraNANP
%International

Routing
%Direct
%Tandem

Direction
%Inbound
%Outbound
%Transit
%Internal
%External

Source Type
%AMA
%OCC
%SS7
%DUF
%RetailBill

Carrier Type
%UNKNOWN
%OTHER
%CAP
%CLEC
%GENERAL
%IC
%ICO
%L_RESELLER
%LEC
%PCS
%RBOC
%RESELLER
%ULEC
%W_RESELLER
%WIRELESS



This node also supports four Expert operators: npa, nxx, line and FeatureSet. Npa,
nxx and line yield the relevant portions of a passed-in TN (telephone number). FeatureSet
is a bit operator that tests whether a specified bit is present in the specified bitfield.
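
As a rough illustration of what these four operators compute, the following Python sketch splits a telephone number into its NPA, NXX and line portions and tests a bit in a Features-style bitfield. It assumes a ten-digit NANP number held as an integer. The function names and the bit values assigned to the two constants are hypothetical; the actual Expert operators and constant values are not given in this description.

```python
# Hypothetical stand-ins for the npa, nxx, line and FeatureSet operators.
# Assumption: the TN is a 10-digit NANP number stored as an integer.

THREE_WAY_CALL = 0x01  # assumed bit value, for illustration only
FORWARDED_CALL = 0x04  # assumed bit value, for illustration only


def npa(tn: int) -> int:
    """Area code: the first three digits of a 10-digit TN."""
    return tn // 10_000_000


def nxx(tn: int) -> int:
    """Exchange: digits four through six."""
    return (tn // 10_000) % 1_000


def line(tn: int) -> int:
    """Line number: the last four digits."""
    return tn % 10_000


def feature_set(bitfield: int, flag: int) -> bool:
    """True if every bit of `flag` is present in `bitfield`."""
    return (bitfield & flag) == flag
```

For example, a record whose Features field is 0x05 would test true for both assumed flags, while a field of 0x01 would test true only for the three-way-call flag.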















Parameter            Type        Description
InputFileNameColumn  String      The column name in the input file to use to get the usage filename
ReaderType           String      The type of registered usage reader to use (e.g., AMA)
UseSwitchMap         String      Whether or not to augment cdr data with lerg lookup data (optional, default is false)
OutputExprFile       Inlinefile  Expert language to operate an SQL statement.
BRXs

As stated earlier, the BRE may compile a BRG into a BRX, which is an execution
file that is executed by the controller using the server farm at a desired frequency. The
controller may be a command-line Java application that can be automated through cron or
another similar utility (for example). Moreover, the BRE may function as a controller
when BRGs are executed from within it.

The controller analyzes the BRX and distributes the task(s) of each of the nodes
over available processing resources of the server farm, which uses drones to perform each
of the tasks, preferably in a most efficient manner. Specifically, the controller may
delegate work at a granularity of individual BRX nodes, and coordinate communication
between drones executing the processes of interconnected nodes. When a drone completes
a task, the controller may schedule the process of a next available node for execution on
that drone.
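
The delegation described above can be pictured with a small Python simulation. This is only a sketch under stated assumptions, not the controller's actual algorithm: the function name schedule, the drone names, and the rule that the oldest running task always finishes first are all hypothetical, and the sketch starts a node only after its upstream nodes have finished, whereas linked nodes in the described system may also stream data to one another.

```python
from collections import deque


def schedule(deps, drones):
    """Hand nodes to drones in dependency order.

    deps maps each node name to the set of nodes whose output it consumes.
    Returns (node, drone) pairs in the order the work was handed out.
    Simplification: a node starts only after its upstream nodes finish,
    and the oldest running task is always the next one to complete.
    """
    if not drones:
        raise ValueError("at least one drone is required")
    waiting = {node: set(upstream) for node, upstream in deps.items()}
    ready = deque(sorted(n for n, upstream in waiting.items() if not upstream))
    idle = deque(drones)
    running = deque()
    assignments = []
    while ready or running:
        # Give every ready node to an idle drone while both are available.
        while ready and idle:
            pair = (ready.popleft(), idle.popleft())
            running.append(pair)
            assignments.append(pair)
        # Simulate a completion; the freed drone becomes idle again.
        done, drone = running.popleft()
        idle.append(drone)
        del waiting[done]
        # Any node that was waiting only on the finished node is now ready.
        for node in sorted(waiting):
            if done in waiting[node]:
                waiting[node].discard(done)
                if not waiting[node]:
                    ready.append(node)
    return assignments
```

With two drones and a graph in which a concatenate node consumes two input nodes and a sort node consumes the concatenate node, the two input nodes are handed out first, and the downstream nodes follow as drones become free.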



Creating a BRG

Figs. 9-27 illustrate an example of creating a BRG using the BRE. In this example,
a BRG will be constructed to validate data from two input files and a database,
concatenate the two input files, sort the input files, join the data from the two input sources
(files and database), filter the data from the join, aggregate the results and then load the
results into a database table. One of skill in the art will appreciate that the following
process is merely an example and is not meant to limit the scope of the present invention.
As shown in Fig. 3, a screenshot of the BRE, and Fig. 4, a screenshot of a BRG, various
nodes may be selected from the primitives node library 310 by clicking on the desired
button.






In constructing a BRG according to the present example, as shown in Fig. 9, an
Infile button may be used to add Infile nodes 910 and 920 into the BRG. In addition, in
this particular BRG, a Querydump node 930 is added to the BRG, each node having a
corresponding output 910a, 920a and 930a, respectively. These nodes serve to retrieve data
from a file or database that the BRG will process/validate. Parameters of a node may be
changed by, for example, right-clicking on the particular node, which generates, for
example, a popup window listing the customizable parameters for the particular
node. As shown in Fig. 10, for an Infile node, the parameters may include notes 1010 to
add comments about the node (e.g., which may automatically be displayed when the mouse
is hovered over the node). As stated in the previous section, the location of the data file to
retrieve is specified at 1020. Other parameters may be declared by clicking on a "declare
parameters" button. For the Querydump node, the parameters may include login
information 1110 (Fig. 11) for logging into the database having the desired data and query
language 1120 to perform a search of the database to retrieve specific data.

Outputs and inputs may be managed in the parameters window as well, in that 
inputs and outputs may be added or modified (e.g., renamed) by clicking on the "Add 
Input" or "Add Output" button, which displays a popup window for each (see Fig. 12). 

Fig. 13 illustrates the addition of a concatenate node (Cat) 1310 in addition to the two
Infile nodes and a querydump node. In the instant example, the Cat node concatenates data
from one of the Infile nodes and the querydump node. To integrate the Cat node with
another node, an output of one of the Infile nodes is linked 1320 to the input of the Cat 
node (e.g., clicking on an output arrow on one node and dragging it to an input arrow of 
another node). 

The parameters of the Cat node may be modified. As shown in Fig. 14, headers 
may be stripped from the data (by entering "true"), and the type of concatenation may be
specified (union, intersection, exact). A listing of the inputs and outputs of the node may 
also be displayed. In this example, the Cat node will be a union. 

During the process of creating a BRG, nodes may be executed at any time to
determine (test/debug) if they are performing the required task(s). During such an
execution, the nodes and/or inputs and outputs may be color coded to indicate a status of
processing. For example, unprocessed nodes may include a first color (e.g., gray),
nodes which are currently processing may include a second color (e.g., yellow), nodes
which have successfully processed may include a third color (e.g., green) and those that
have failed processing may include yet a fourth color (e.g., red). With regard to inputs and
outputs, particular colors may indicate if the input or output is connected, satisfied,
missing, in process or complete.

After any execution, whether to debug certain nodes or to execute an entire BRG, 
data results for each node may be displayed on the BRG. For example, line counts 1510 
(the number of data rows processed) may be displayed adjacent the node (or on the node, 
or via a hovering mouse) at the output (for example) (see Fig. 15). Displaying the results 
of the processed data may be accomplished via a button in the node properties window (see 
Figs. 16A-16C). 

As shown in Fig. 17, two sort nodes 1710, 1720 are added to the instant example:
one to sort data from the output of one of the Infile nodes (1710), and another to sort data
from the output of the Cat node (1720). The parameters of each sort node may include a
note area 1810 (Fig. 18) to add notes about the node, a compare order area 1820 to define
the field that is used for sorting (may be predefined to sort in a particular order, e.g.,
ascending), and an area to add in custom comparison logic 1830 using Expert language. In
addition, a "unique" area 1840 may be included; if it is set to "true", duplicate data is
eliminated. In the example shown in Fig. 17, "Name" is used for sorting the data (in
ascending order).
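
The behavior of the sort node parameters can be approximated in a few lines of Python. This is a sketch under assumptions rather than the node's implementation: rows are modeled as dictionaries, the function name sort_node is hypothetical, and "duplicate" is assumed to mean two rows with the same value in the sort field, which the description does not state explicitly.

```python
def sort_node(rows, field, unique=False):
    """Sort rows ascending on `field`; optionally drop duplicate keys.

    Assumption: rows with the same value in `field` count as duplicates,
    and the first such row (in input order) is the one that is kept.
    """
    ordered = sorted(rows, key=lambda row: row[field])  # stable sort
    if not unique:
        return ordered
    result = []
    for row in ordered:
        # After sorting, equal keys are adjacent, so one look-back suffices.
        if not result or row[field] != result[-1][field]:
            result.append(row)
    return result
```

Sorting on "Name" with unique left at its default keeps every row, while setting unique to true keeps one row per distinct name.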

As shown in Fig. 19, a join node 1910 is added in the example, and defined to
include a total of two inputs and three outputs, with the outputs: "Only in File 1", "Only in
File 2" and "In both". Then, using Expert, the logic for the join may be drafted (see Fig.
22). In this example, the following logic is used: (cmp '1:Name' '2:Name').* This logic
determines whether there is a match or not between the data results from the sort nodes.

In Fig. 20, the user may indicate the join types, i.e., what records to include in the
output: left (outer) output, right (outer) output and inner output ("lir"). Fig. 21 is a
Venn diagram illustrating these parameters: File 1 is a left join "L" ("Only in File 1"); File
2 is a right join "R" ("Only in File 2"); and the inner join "i" is "In both file 1 and file 2".
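
One way to picture a join with these three outputs is a merge over two inputs that are already sorted on the comparison field. The Python sketch below is illustrative only: the function name join_node and the dictionary row model are hypothetical, and it assumes each "Name" value appears at most once per input.

```python
def join_node(left, right, key="Name"):
    """Merge-join two lists of dicts already sorted ascending on `key`.

    Returns (only_left, only_right, both), mirroring the three outputs
    "Only in File 1", "Only in File 2" and "In both".
    Assumption: keys are unique within each input.
    """
    only_left, only_right, both = [], [], []
    i = j = 0
    while i < len(left) and j < len(right):
        a, b = left[i][key], right[j][key]
        if a < b:            # present only in the left input
            only_left.append(left[i])
            i += 1
        elif a > b:          # present only in the right input
            only_right.append(right[j])
            j += 1
        else:                # match: combine the fields of both rows
            both.append({**left[i], **right[j]})
            i += 1
            j += 1
    # Whatever remains on either side has no partner on the other.
    only_left.extend(left[i:])
    only_right.extend(right[j:])
    return only_left, only_right, both
```

Because both inputs arrive sorted from the upstream sort nodes, a single pass over each input is enough to fill all three outputs.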



* "cmp" is an example of a command that may be used in a scripting computer language to perform
comparison between data.
A Filtering node 2310 is added to the example BRG in Fig. 23. The Filtering node
may be used to transform file data using Expert. For example, a single column of data may
be removed, or, in the case of validating telecommunications data, a usage file could be
filtered to remove any records that do not have a duration greater than 5 seconds, for
example. In the instant example, the "Only in File 1" output is linked to the input of the
Filtering node, and is used as a simple pass-through to illustrate the use of the node. To
that end, Expert language to accomplish such an output structure is:

(output "out1"
    (output-all-fields)
)
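
For comparison, the same two behaviors (a pass-through and the five-second duration filter mentioned above) can be sketched in Python. This is not Expert code and does not reproduce its semantics; the function name filter_node and the "Duration" field name are hypothetical.

```python
def filter_node(rows, predicate=lambda row: True):
    """Return the rows for which predicate(row) is true.

    With the default predicate every row passes through unchanged,
    which corresponds to the simple pass-through in the example.
    """
    return [row for row in rows if predicate(row)]


def longer_than_five_seconds(row):
    """Keep only records whose (assumed) Duration field exceeds 5 seconds."""
    return row["Duration"] > 5
```

Calling filter_node(rows) returns every row, while filter_node(rows, longer_than_five_seconds) drops the short records.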

As shown in Fig. 24, an Agg node 2410 is added to process a data set and group
the output data set depending on the aggregator specified (e.g., in an AggExprFile
attribute). The Agg node is useful for calculating counts and sums on a data set. In the
instant example, the output of the Filtering node is wired to the input of the Agg node.

Using Expert, the output of the Agg node is established as shown in Fig. 25. Also
shown is the AggExprFile parameter which defines the fields to group the output. 
Preferably, the Agg node includes a single output. 
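
The kind of grouping the Agg node performs, counts and sums per group, can be illustrated with a short Python sketch. The function name agg_node and the field names used in the usage note are hypothetical, and the sketch does not attempt to model the AggExprFile syntax.

```python
from collections import defaultdict


def agg_node(rows, group_by, sum_field):
    """Group rows on `group_by`; return a count and a sum per group."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for row in rows:
        bucket = totals[row[group_by]]
        bucket["count"] += 1
        bucket["sum"] += row[sum_field]
    # Convert to a plain dict so callers see a single ordinary result set.
    return dict(totals)
```

Grouping call records on a state field and summing a seconds field, for instance, yields one count and one total per state in a single output.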

The results generated by the Agg node may be loaded into a database using the
dbloader node 2610, as shown in Fig. 26 (the completed BRG): the output of the Agg node
is wired to the input of the dbloader node. Fig. 27 shows a popup window for modifying
the parameters of the dbloader node, with fields for specifying the particular database to
store the data. Expert may be used to structure the output for storage on the database. In
the instant example, all the fields produced by the Agg node are stored in the database. The
completed BRG is now ready for compilation into a BRX so that it may be processed by a
server farm.
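
The completed example graph can be written down as a plain dependency map, which makes it easy to check that every node other than a data retrieval node has an input and to derive an order of execution. The Python sketch below uses the standard-library graphlib module. The node names are hypothetical labels for the nodes of Figs. 9-26, and which Infile node feeds the Cat node is an assumption, since the description says only "one of the Infile nodes".

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes whose outputs feed its inputs.
# Assumption: infile_2 feeds the Cat node and infile_1 feeds the first sort.
BRG = {
    "infile_1": set(),
    "infile_2": set(),
    "querydump": set(),
    "cat": {"infile_2", "querydump"},
    "sort_1": {"infile_1"},
    "sort_2": {"cat"},
    "join": {"sort_1", "sort_2"},
    "filter": {"join"},
    "agg": {"filter"},
    "dbloader": {"agg"},
}

RETRIEVAL_NODES = {"infile_1", "infile_2", "querydump"}


def missing_inputs(graph, retrieval_nodes):
    """Names of non-retrieval nodes that have no input linked."""
    return sorted(
        node for node, upstream in graph.items()
        if not upstream and node not in retrieval_nodes
    )


def execution_order(graph):
    """One valid order in which the nodes could be processed."""
    return list(TopologicalSorter(graph).static_order())
```

Any order returned by execution_order places each node after the nodes it depends on, so the dbloader node always comes last in this graph.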



Other Features 

Debugging: While a BRG is being created, it may be "debugged" along the way. 
For example, using the BRE in a debugging mode, datastreams from each node may be 
written to a temporary file which may be tracked and fed back to a remote client 
application for examination by a user to determine how the BRG (or particular node) is
performing. For efficiency, a predetermined number of rows of data (e.g., 10 rows) may be
specified so that one need not retrieve an entire (large) file. 

Moreover, with regard to such temporary file storage, since such temporary files
stored on a server during debugging can exceed the storage capacity of the server, an
"aggressive" deletion process may be included in embodiments of the invention in which
temporary files no longer needed by any node are deleted. Conversely, while a BRG is
running, it may be desirable to retain downstream temporary files even though they are
scheduled for deletion (or replacement). Accordingly, a "lazy" deletion process may be
included in embodiments of the invention. Using such a process, a temporary file is not
deleted until the time that a node replaces it.
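
The difference between the two deletion policies can be shown with a small bookkeeping sketch in Python. It tracks file names in memory only and deletes nothing from disk. The class name TempFiles, its methods and the file names in the usage note are hypothetical, and the sketch is one possible reading of the "aggressive" and "lazy" processes described above.

```python
class TempFiles:
    """Track which temporary files are still kept under a deletion policy."""

    def __init__(self, aggressive):
        self.aggressive = aggressive
        self.pending = {}   # file name -> nodes that still need to read it
        self.live = set()   # file names currently kept
        self.replaced = 0   # how many times an existing file was overwritten

    def write(self, name, consumers):
        """A node writes (or rewrites) a temporary file read by `consumers`."""
        if name in self.live:
            self.replaced += 1  # under the lazy policy this is the only removal
        self.live.add(name)
        self.pending[name] = set(consumers)

    def consumed(self, name, node):
        """Record that `node` has finished reading the file."""
        self.pending[name].discard(node)
        # Aggressive policy: drop the file once no node needs it any more.
        if self.aggressive and not self.pending[name]:
            self.live.discard(name)
```

Under the aggressive policy a file written for two consumers disappears as soon as the second consumer finishes; under the lazy policy it remains until a node writes a file of the same name again.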

Servers and Server Farms 

BRGs and BRXs may be executed on server farms. The servers may be any 
computer, e.g., multiprocessor, desktop PCs, anything in between, or a heterogeneous 
mixture. Embodiments of the invention may be written in Java, for example, so each could
theoretically run on any platform (e.g., HP-UX/PA-RISC, Solaris/SPARC, Red Hat 
Linux/i386, and Win32/i386). A server farm may be any mixture of these platforms. 

While data is often communicated from the output of one node to the input of
another ("linking") directly via TCP sockets (for example), some files may be created to be
used as temporary storage. For example, during BRG development, the BRE may direct
one or more drones to write intermediate outputs to a file to aid in iterative development.
In production mode, the controller may direct drones to use files to avoid potential
deadlock scenarios (for example). As a result, each of the servers in a farm may require
access to such files written by other servers. In addition, the same filename used by a
drone on one server should be usable on every other server in the farm.

This may be accomplished using a central file server with a volume mounted in a
consistent location. Another option includes having each server export a volume, and for
each server to mount every other server's volume (in a consistent way). Each server may
then be configured to write temporary data files to its local volume, using the standard
path. For example:
/server-farm/server-1/    mount of server-1 volume
/server-farm/server-2/    mount of server-2 volume

/server-farm/server-i/    link to local volume
...

/server-farm/server-n/    mount of server-n volume
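
Under this convention a temporary file's path depends only on the server that wrote it, so every server can compute the same path. A minimal Python sketch follows; the function name temp_path and the file name in the usage note are hypothetical, and the /server-farm root is taken from the listing above.

```python
from pathlib import PurePosixPath

FARM_ROOT = PurePosixPath("/server-farm")


def temp_path(server, filename):
    """Path of a temporary file written by `server`, valid on every server."""
    return str(FARM_ROOT / server / filename)
```

A drone on server-2 would write to temp_path("server-2", "node7.tmp"), and a drone on any other server could read the file through the same path via the mounted volume.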

The foregoing description is considered as illustrative only of the principles of the
various embodiments of the invention. Further, since numerous modifications and
changes will readily occur to those skilled in the art, it is not desired to limit the invention
to the exact construction and operation shown and described, and accordingly, all suitable
modifications and equivalents may be resorted to, falling within the scope of the
invention.

The present application also incorporates by reference, in its entirety, the 
disclosure of the priority document for the present application, U.S. provisional patent 
application no. 60/516,483, filed October 30, 2004, entitled "SYSTEM AND METHOD
FOR IDENTIFICATION OF REVENUE DISCREPANCIES".
WHAT IS CLAIMED IS:

1. A method for processing data using a graphical user interface of a computer system
comprising: 

arranging a plurality of nodes in a graph, wherein each node represents at 
least one processing step for processing data by a processor and wherein at least one 
of the plurality of nodes comprise at least one data retrieval node for retrieving data 
for validation; 

establishing at least one output from substantially all of the plurality of
nodes;

except for the at least one data retrieval node, establishing at least one input 
to each of the plurality of nodes; 

configuring one or more parameters of each node; 

linking at least one output of each of substantially all of the plurality of
nodes to an input of another node, each link representing a data flow; 

sequencing a dependency among the plurality of nodes; and 

establishing processing logic in at least one node to process data in a
predetermined manner.

2. The method according to claim 1, wherein the data retrieval node comprises an
infile node which retrieves data from a particular data file.

3. The method according to claim 1, wherein the data retrieval node comprises a
querydump node for retrieving data from a query of a particular database.

4. The method according to claim 1, wherein the data retrieval node comprises a
Herefile node for placing data into a graph.
5. The method according to claim 3, wherein the querydump node includes
information for identifying the database and query terms for performing a query on 
the database. 

6. The method according to claim 5, wherein the querydump node further includes
information for accessing the database. 

7. The method according to claim 1, further comprising executing one or more nodes 
of the graph-space. 

8. The method according to claim 1, further comprising executing the graph-space of 
the workspace according to the sequence dependency. 

9. The method according to claim 8, further comprising color-coding the one or more 
nodes according to a status of the execution of respective node. 

10. The method according to claim 9, wherein the status of the node comprises 
unprocessed, processing, successfully processed and failed processing indicators. 

11. The method according to claim 8, further comprising displaying results of the
graph-space execution. 

12. The method according to claim 1, further comprising creating a composite node for
the graph-space, wherein the composite node represents a grouping at least a pair of 
the plurality of nodes. 

13. The method according to claim 1, further comprising setting one or more 
parameters of one or more of the plurality of nodes. 

14. The method according to claim 1, wherein establishing logic comprises including 
one or more expressions, statements, and/or operators. 

15. The method according to claim 14, wherein the statements may be selected from the 
group consisting of: variable related statements, output related statements, database 
related statements, procedural statements. 
16. The method according to claim 14, wherein the operators may be selected from the
group consisting of numerical operators, logical operators, comparison operators, 
conditional operators, null operators, string operators, date and/or time operators, 
and list operators. 

17. A computer readable media having computer instructions for enabling a computer
system to perform a method for validating data using a graphical user interface of a 
computer system, the method comprising: 

defining one or more parameters of a graph-space; 

arranging a plurality of nodes in a graph-space, wherein each node 
represents at least one processing step to be performed to validate data and wherein 
at least one of the plurality of nodes comprise at least one data retrieval node for 
retrieving data for validation; 

establishing at least one output from each of the plurality of nodes; 

except for the at least one data retrieval node, establishing at least one input 
from each of the plurality of nodes; 

configuring one or more parameters of each node; 

linking at least one output of each of substantially all of the plurality of 
nodes with an input of another node; 

sequencing a dependency among the plurality of nodes; and 

establishing processing logic in at least one of the plurality of nodes to
process data. 

18. The media according to claim 17, wherein the data retrieval node comprises an
infile node which retrieves data from a particular data file. 



19. The media according to claim 17, wherein the data retrieval node comprises a
querydump node for retrieving data from a query of a particular database.
20. The media according to claim 1, wherein the data retrieval node comprises a
Herefile node for placing data into a graph. 

21. The media according to claim 19, wherein the querydump node includes
information for identifying the database and query terms for performing a query on
the database. 

22. The media according to claim 21, wherein the querydump node further includes
information for accessing the database.

23. The media according to claim 17, further comprising executing one or more nodes
of the graph-space. 

24. The media according to claim 17, wherein the method further comprises executing 
the graph-space of the workspace according to the sequence dependency.

25. The media according to claim 24, wherein the method further comprises
color-coding the one or more nodes according to a status of the execution of respective
node. 

26. The media according to claim 25, wherein the status of the node comprises 
unprocessed, processing, successfully processed and failed processing. 

27. The media according to claim 24, wherein the method further comprises displaying 
results of the graph-space execution. 

28. The media according to claim 17, further comprising creating a composite node for 
the graph-space, wherein the composite node represents a grouping at least a pair of 
the plurality of nodes. 

29. The media according to claim 17, wherein the method further comprises setting one
or more parameters of one or more of the plurality of nodes. 

30. The media according to claim 17, wherein the method further comprises setting one 
or more expressions, statements, and/or operators for one or more nodes. 
31. A system for processing data using a graphical user interface of a computer system
comprising: 

arranging means for arranging a plurality of nodes in a graph-space, wherein 
each node represents at least one processing step for processing data and wherein at 
least one of the plurality of nodes comprise at least one data retrieval node for 
retrieving data for validation; 

establishing means for establishing at least one output from substantially all 
of the plurality of nodes and for establishing at least one input to each of the
plurality of nodes, except for the at least one data retrieval node; 

configuring means for configuring one or more parameters of each node; 

linking means for linking at least one output of each of substantially all of
the plurality of nodes with an input of another node, each link representing a data 
flow; 

sequencing means for sequencing execution of one or more nodes; and 

setup means for setting up processing logic in at least one node to process 
data in a predetermined manner. 

32. A system for processing data using a graphical user interface of a computer system
comprising: 

an editor including a graphical user interface; 

a graphical workspace for designing a processing graph having a plurality of 
processing nodes; 

an execution file, wherein the execution file results from compiling the 
processing graph; and 



a controller for directing the running of the execution file on one or more 
computers. 
33. The system according to claim 32, wherein the one or more computers comprises a
server farm. 

34. The system according to claim 33, wherein the server farm includes one or more
drones each for operating a process of one or more nodes. 

35. An application program having computer instructions for enabling a computer
system to perform a method for validating data using a graphical user interface of a
computer system, the method comprising: 

defining one or more parameters of a graph-space; 

arranging a plurality of nodes in a graph-space, wherein each node 
represents at least one processing step to be performed to validate data and wherein
at least one of the plurality of nodes comprise at least one data retrieval node for 
retrieving data for validation; 

establishing at least one output from each of the plurality of nodes; 

except for the at least one data retrieval node, establishing at least one input 
from each of the plurality of nodes; 

configuring one or more parameters of each node; 

linking at least one output of each of substantially all of the plurality of
nodes with an input of another node; 

sequencing a dependency among the plurality of nodes; and 

establishing processing logic in at least one of the plurality of nodes to 
process data. 